Introduction

This project analyses a football player dataset. The dataset is sourced from the popular video game ‘FIFA 19’ and contains an extensive collection of data on professional football players from around the world.

Problem statement 1

We will look at the data to find the best players that money can buy, along with young prospects who can be signed now for a relatively small fee but are likely to become superstars in the future. We will also try to construct the best possible team for specific formations of the game.

Problem statement 2

The world of professional football picks up a frenetic pace during the two transfer windows of the year, when players are allowed to move between clubs. Clubs are often required to pay a premium transfer fee to secure a player who fits their needs.

Owing to the disparity in the financial strength of the clubs and also the players’ desire to move to a bigger club, it is often seen that big clubs are able to poach away talent from the smaller clubs or even their rivals.

To make sure that the selling club gets their profit for the player that they are selling, it has become commonplace to insert ‘Release Clauses’ in player contracts.

A release clause specifies the amount that a buying club must pay the selling club in order to buy that particular player. It gives the player peace of mind, knowing that a bigger club can trigger the clause by paying the specified amount and so allow the player to move. At the same time, the selling club is assured of a return on the time and money invested in the player and will have the funds to reinvest in other players.

The amount specified in a release clause can depend on variables such as the player’s ability, their perceived market value and their age. For example, if an 18-year-old player’s perceived market value is $10,000,000, it would make sense for the selling club to put the player on a 5-year contract and set the release clause above the market value, so as to deter other clubs from poaching the talent. More often than not, release clauses for top players are higher than their market value. This can change, of course, because a player’s market value fluctuates with their performances on the pitch week in, week out, while the release clause amount stays the same for the length of the contract.

We will use regression models to predict the players’ release clause values from the predictor variables available in the dataset.

As release clause figures are often not disclosed to the public, a predictive model would help a buying club estimate a realistic transfer sum that it will have to pay to get a player.

Taking a look at the available dataset columns.

## Observations: 18,207
## Variables: 89
## $ ï..                      <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,...
## $ ID                       <int> 158023, 20801, 190871, 193080, 192985...
## $ Name                     <fct> L. Messi, Cristiano Ronaldo, Neymar J...
## $ Age                      <int> 31, 33, 26, 27, 27, 27, 32, 31, 32, 2...
## $ Photo                    <fct> https://cdn.sofifa.org/players/4/19/1...
## $ Nationality              <fct> Argentina, Portugal, Brazil, Spain, B...
## $ Flag                     <fct> https://cdn.sofifa.org/flags/52.png, ...
## $ Overall                  <int> 94, 94, 92, 91, 91, 91, 91, 91, 91, 9...
## $ Potential                <int> 94, 94, 93, 93, 92, 91, 91, 91, 91, 9...
## $ Club                     <fct> FC Barcelona, Juventus, Paris Saint-G...
## $ Club.Logo                <fct> https://cdn.sofifa.org/teams/2/light/...
## $ Value                    <fct> €110.5M, €77M, €118.5M, €72M,...
## $ Wage                     <fct> €565K, €405K, €290K, €260K, €...
## $ Special                  <int> 2202, 2228, 2143, 1471, 2281, 2142, 2...
## $ Preferred.Foot           <fct> Left, Right, Right, Right, Right, Rig...
## $ International.Reputation <int> 5, 5, 5, 4, 4, 4, 4, 5, 4, 3, 4, 4, 3...
## $ Weak.Foot                <int> 4, 4, 5, 3, 5, 4, 4, 4, 3, 3, 4, 5, 3...
## $ Skill.Moves              <int> 4, 5, 5, 1, 4, 4, 4, 3, 3, 1, 4, 3, 2...
## $ Work.Rate                <fct> Medium/ Medium, High/ Low, High/ Medi...
## $ Body.Type                <fct> Messi, C. Ronaldo, Neymar, Lean, Norm...
## $ Real.Face                <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Ye...
## $ Position                 <fct> RF, ST, LW, GK, RCM, LF, RCM, RS, RCB...
## $ Jersey.Number            <int> 10, 7, 10, 1, 7, 10, 10, 9, 15, 1, 9,...
## $ Joined                   <fct> "Jul 1, 2004", "Jul 10, 2018", "Aug 3...
## $ Loaned.From              <fct> , , , , , , , , , , , , , , , , , , ,...
## $ Contract.Valid.Until     <fct> 2021, 2022, 2022, 2020, 2023, 2020, 2...
## $ Height                   <fct> 5'7, 6'2, 5'9, 6'4, 5'11, 5'8, 5'8, 6...
## $ Weight                   <fct> 159lbs, 183lbs, 150lbs, 168lbs, 154lb...
## $ LS                       <fct> 88+2, 91+3, 84+3, , 82+3, 83+3, 77+3,...
## $ ST                       <fct> 88+2, 91+3, 84+3, , 82+3, 83+3, 77+3,...
## $ RS                       <fct> 88+2, 91+3, 84+3, , 82+3, 83+3, 77+3,...
## $ LW                       <fct> 92+2, 89+3, 89+3, , 87+3, 89+3, 85+3,...
## $ LF                       <fct> 93+2, 90+3, 89+3, , 87+3, 88+3, 84+3,...
## $ CF                       <fct> 93+2, 90+3, 89+3, , 87+3, 88+3, 84+3,...
## $ RF                       <fct> 93+2, 90+3, 89+3, , 87+3, 88+3, 84+3,...
## $ RW                       <fct> 92+2, 89+3, 89+3, , 87+3, 89+3, 85+3,...
## $ LAM                      <fct> 93+2, 88+3, 89+3, , 88+3, 89+3, 87+3,...
## $ CAM                      <fct> 93+2, 88+3, 89+3, , 88+3, 89+3, 87+3,...
## $ RAM                      <fct> 93+2, 88+3, 89+3, , 88+3, 89+3, 87+3,...
## $ LM                       <fct> 91+2, 88+3, 88+3, , 88+3, 89+3, 86+3,...
## $ LCM                      <fct> 84+2, 81+3, 81+3, , 87+3, 82+3, 88+3,...
## $ CM                       <fct> 84+2, 81+3, 81+3, , 87+3, 82+3, 88+3,...
## $ RCM                      <fct> 84+2, 81+3, 81+3, , 87+3, 82+3, 88+3,...
## $ RM                       <fct> 91+2, 88+3, 88+3, , 88+3, 89+3, 86+3,...
## $ LWB                      <fct> 64+2, 65+3, 65+3, , 77+3, 66+3, 82+3,...
## $ LDM                      <fct> 61+2, 61+3, 60+3, , 77+3, 63+3, 81+3,...
## $ CDM                      <fct> 61+2, 61+3, 60+3, , 77+3, 63+3, 81+3,...
## $ RDM                      <fct> 61+2, 61+3, 60+3, , 77+3, 63+3, 81+3,...
## $ RWB                      <fct> 64+2, 65+3, 65+3, , 77+3, 66+3, 82+3,...
## $ LB                       <fct> 59+2, 61+3, 60+3, , 73+3, 60+3, 79+3,...
## $ LCB                      <fct> 47+2, 53+3, 47+3, , 66+3, 49+3, 71+3,...
## $ CB                       <fct> 47+2, 53+3, 47+3, , 66+3, 49+3, 71+3,...
## $ RCB                      <fct> 47+2, 53+3, 47+3, , 66+3, 49+3, 71+3,...
## $ RB                       <fct> 59+2, 61+3, 60+3, , 73+3, 60+3, 79+3,...
## $ Crossing                 <int> 84, 84, 79, 17, 93, 81, 86, 77, 66, 1...
## $ Finishing                <int> 95, 94, 87, 13, 82, 84, 72, 93, 60, 1...
## $ HeadingAccuracy          <int> 70, 89, 62, 21, 55, 61, 55, 77, 91, 1...
## $ ShortPassing             <int> 90, 81, 84, 50, 92, 89, 93, 82, 78, 2...
## $ Volleys                  <int> 86, 87, 84, 13, 82, 80, 76, 88, 66, 1...
## $ Dribbling                <int> 97, 88, 96, 18, 86, 95, 90, 87, 63, 1...
## $ Curve                    <int> 93, 81, 88, 21, 85, 83, 85, 86, 74, 1...
## $ FKAccuracy               <int> 94, 76, 87, 19, 83, 79, 78, 84, 72, 1...
## $ LongPassing              <int> 87, 77, 78, 51, 91, 83, 88, 64, 77, 2...
## $ BallControl              <int> 96, 94, 95, 42, 91, 94, 93, 90, 84, 1...
## $ Acceleration             <int> 91, 89, 94, 57, 78, 94, 80, 86, 76, 4...
## $ SprintSpeed              <int> 86, 91, 90, 58, 76, 88, 72, 75, 75, 6...
## $ Agility                  <int> 91, 87, 96, 60, 79, 95, 93, 82, 78, 6...
## $ Reactions                <int> 95, 96, 94, 90, 91, 90, 90, 92, 85, 8...
## $ Balance                  <int> 95, 70, 84, 43, 77, 94, 94, 83, 66, 4...
## $ ShotPower                <int> 85, 95, 80, 31, 91, 82, 79, 86, 79, 2...
## $ Jumping                  <int> 68, 95, 61, 67, 63, 56, 68, 69, 93, 7...
## $ Stamina                  <int> 72, 88, 81, 43, 90, 83, 89, 90, 84, 4...
## $ Strength                 <int> 59, 79, 49, 64, 75, 66, 58, 83, 83, 7...
## $ LongShots                <int> 94, 93, 82, 12, 91, 80, 82, 85, 59, 1...
## $ Aggression               <int> 48, 63, 56, 38, 76, 54, 62, 87, 88, 3...
## $ Interceptions            <int> 22, 29, 36, 30, 61, 41, 83, 41, 90, 1...
## $ Positioning              <int> 94, 95, 89, 12, 87, 87, 79, 92, 60, 1...
## $ Vision                   <int> 94, 82, 87, 68, 94, 89, 92, 84, 63, 7...
## $ Penalties                <int> 75, 85, 81, 40, 79, 86, 82, 85, 75, 1...
## $ Composure                <int> 96, 95, 94, 68, 88, 91, 84, 85, 82, 7...
## $ Marking                  <int> 33, 28, 27, 15, 68, 34, 60, 62, 87, 2...
## $ StandingTackle           <int> 28, 31, 24, 21, 58, 27, 76, 45, 92, 1...
## $ SlidingTackle            <int> 26, 23, 33, 13, 51, 22, 73, 38, 91, 1...
## $ GKDiving                 <int> 6, 7, 9, 90, 15, 11, 13, 27, 11, 86, ...
## $ GKHandling               <int> 11, 11, 9, 85, 13, 12, 9, 25, 8, 92, ...
## $ GKKicking                <int> 15, 15, 15, 87, 5, 6, 7, 31, 9, 78, 1...
## $ GKPositioning            <int> 14, 14, 15, 88, 10, 8, 14, 33, 7, 88,...
## $ GKReflexes               <int> 8, 11, 11, 94, 13, 8, 9, 37, 11, 89, ...
## $ Release.Clause           <fct> €226.5M, €127.1M, €228.1M, €1...

Taking a look at the entries in the dataset

It can be seen that while the dataset provides extensive information about player attributes, several columns can be dropped for this analysis.

Dropping the columns not needed for analysis
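
The report does not show which columns were dropped. As a minimal sketch, assuming the image URL columns (`Photo`, `Flag`, `Club.Logo`) are among them, the step could look like this on a toy data frame. The drop list and the names `fifa` and `fifa_small` are illustrative assumptions.

```r
# Toy stand-in for the full data frame; only a handful of columns are shown.
fifa <- data.frame(
  ID = c(158023, 20801),
  Name = c("L. Messi", "Cristiano Ronaldo"),
  Photo = c("photo1.png", "photo2.png"),
  Flag = c("flag1.png", "flag2.png"),
  Club.Logo = c("logo1.png", "logo2.png"),
  Overall = c(94, 94),
  stringsAsFactors = FALSE
)

# Hypothetical drop list: image URLs carry no analytical information.
drop_cols <- c("Photo", "Flag", "Club.Logo")

# Keep every column whose name is not in the drop list.
fifa_small <- fifa[, !(names(fifa) %in% drop_cols), drop = FALSE]

names(fifa_small)
```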

Data cleaning

The player value, wage and release clause columns contain a currency symbol and suffix letters such as ‘K’ (thousands) and ‘M’ (millions). These columns need to be preprocessed before they can be worked on, so we will apply a function to clean them.

With this, we have extracted the numerical values from the columns.
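
The cleaning function itself is not shown in the report. One possible sketch, assuming values always look like `€110.5M`, `€565K` or `€0`, is given below. The name `parse_money` is hypothetical.

```r
# Convert strings such as "€110.5M" or "€565K" to plain numbers.
parse_money <- function(x) {
  x <- as.character(x)                          # factors -> character
  # Pick the multiplier from the trailing suffix letter, if any.
  multiplier <- ifelse(grepl("M$", x), 1e6,
                ifelse(grepl("K$", x), 1e3, 1))
  digits <- gsub("[^0-9.]", "", x)              # strip currency symbol and suffix
  as.numeric(digits) * multiplier               # empty strings become NA
}

parse_money(c("€110.5M", "€565K", "€0", ""))
```

Empty strings, which appear for players without a recorded release clause, come back as `NA`, so they can be filtered out before modelling.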

Exploring the dataset

Count of players by position

Top 15 nations with most players

Count of players by preferred foot
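
The three counts above can be sketched with base R frequency tables. The toy data below is illustrative only; the column names follow the dataset listing.

```r
# Toy data using the dataset's column names.
players <- data.frame(
  Position = c("ST", "GK", "ST", "CB", "GK", "ST"),
  Nationality = c("Spain", "Brazil", "Spain", "Spain", "France", "Brazil"),
  Preferred.Foot = c("Right", "Left", "Right", "Right", "Right", "Left")
)

# Number of players in each position, most common first.
position_counts <- sort(table(players$Position), decreasing = TRUE)

# Nations with the most players (top 15; the toy data only has 3).
top_nations <- head(sort(table(players$Nationality), decreasing = TRUE), 15)

# Number of players by preferred foot.
foot_counts <- table(players$Preferred.Foot)

position_counts
top_nations
foot_counts
```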

First Problem statement

In order to build a team of superstars, let’s take a look at the teams that currently have the top players and their squad values.

We will try to identify how clubs invest their money in players

As we can see, Juventus invest big money but they also tend to do it wisely, while Inter and Napoli have been keeping the Overall ratings high while keeping a check on the wage budget.

The business patterns can also be seen when the club squad values are mapped against the average overall rating of the players.
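
A per-club summary of this kind could be built roughly as follows. This is a sketch on toy data: `squadval` and `clubsquadvalue` appear in the plotting code later in the report, while `avgwage` and `avgoverall` are assumed names.

```r
# Toy data: cleaned numeric value, wage and rating per player.
players <- data.frame(
  Club = c("A", "A", "B", "B", "B"),
  Value = c(100e6, 50e6, 20e6, 30e6, 10e6),
  Wage = c(500e3, 300e3, 50e3, 70e3, 30e3),
  Overall = c(90, 86, 80, 82, 78)
)

# Total squad value per club.
squad_value <- aggregate(Value ~ Club, data = players, FUN = sum)

# Average wage and average overall rating per club.
avg_stats <- aggregate(cbind(Wage, Overall) ~ Club, data = players, FUN = mean)

# Combine into one summary table, most valuable squads first.
squadval <- merge(squad_value, avg_stats, by = "Club")
names(squadval) <- c("Club", "clubsquadvalue", "avgwage", "avgoverall")
squadval <- squadval[order(-squadval$clubsquadvalue), ]

squadval
```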

However, clubs also pay big money for talented young players. This inflates the squad value while keeping the average overall rating low, because young players are still developing and their current ratings are lower than the level they are expected to reach.

We can see that Real Madrid and Barcelona are betting big on youth, as their player potential shows a remarkable jump as compared to their current overall ratings.

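library(dplyr)
library(ggplot2)
# Plot the 20 clubs with the highest squad value against average player potential.
squadval %>%
  top_n(n = 20, wt = clubsquadvalue) %>%
  ggplot(mapping = aes(x = clubsquadvalue, y = avgpotential)) +
  geom_point(aes(color = Club), size = 4) +
  geom_text(aes(label = Club), hjust = 0, vjust = 0) +
  labs(y = "Average Potential rating of players", x = "Club squad value")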

Finding the next Superstar

Amongst the young crop of players, there are a few with potential to become world beaters. We will try to find these talents.

We will assign weights to their Potential rating and current rating so as to find players with maximum potential growth
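
The weights used in the report are not stated. The sketch below assumes a 0.6 weight on `Potential` and a 0.4 weight on the gap between `Potential` and `Overall`, and treats players aged 21 or under as young. All three numbers are hypothetical.

```r
# Toy players; Overall and Potential are column names from the dataset.
players <- data.frame(
  Name = c("A", "B", "C", "D"),
  Age = c(19, 20, 30, 18),
  Overall = c(70, 80, 85, 60),
  Potential = c(90, 88, 85, 75),
  stringsAsFactors = FALSE
)

w_potential <- 0.6   # hypothetical weight on the ceiling a player can reach
w_growth <- 0.4      # hypothetical weight on how far they can still improve

# Keep young players only (age cut-off is an assumption).
young <- players[players$Age <= 21, ]

# Growth is the gap between potential and current rating.
young$growth <- young$Potential - young$Overall
young$score <- w_potential * young$Potential + w_growth * young$growth

# Highest score first.
young <- young[order(-young$score), ]
young
```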

Which country has the best crop of young players?

Best young prospects that won’t be extremely costly and will have the most potential to grow.

Forming the best team for a traditional 4-4-2 formation

We will keep age below 28, as that is roughly when a footballer is at their peak.
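
One way to sketch the selection is to take the highest-rated eligible players for each line of the formation. The `Line` column (a grouping of the dataset's `Position` codes into GK, DEF, MID and FWD) and the helper `pick_line` are assumptions made for illustration.

```r
# Toy squad pool; Line is a hypothetical grouping of Position codes.
players <- data.frame(
  Name = paste0("P", 1:16),
  Age = c(25, 30, 24, 26, 27, 23, 29, 22, 25, 26, 24, 27, 31, 21, 26, 23),
  Line = c("GK", "GK",
           "DEF", "DEF", "DEF", "DEF", "DEF",
           "MID", "MID", "MID", "MID", "MID",
           "FWD", "FWD", "FWD", "FWD"),
  Overall = c(85, 90, 84, 83, 82, 81, 88, 86, 85, 84, 83, 80, 92, 87, 86, 79),
  stringsAsFactors = FALSE
)

# A 4-4-2: one goalkeeper, four defenders, four midfielders, two forwards.
formation <- c(GK = 1, DEF = 4, MID = 4, FWD = 2)

# Only players younger than 28 are eligible.
eligible <- players[players$Age < 28, ]

# Pick the n best eligible players for one line of the formation.
pick_line <- function(line, n) {
  pool <- eligible[eligible$Line == line, ]
  head(pool[order(-pool$Overall), ], n)
}

team <- do.call(rbind, Map(pick_line, names(formation), formation))
team
```

Changing `formation` (for example to `c(GK = 1, DEF = 4, MID = 3, FWD = 3)`) reuses the same selection logic for other shapes.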

Problem statement 2

We will now use the data to predict the release clause amounts for the players.

We will drop the unnecessary columns and convert the categorical variables into dummy variables.
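
The model output later in the report uses column names such as `Position_RF` and `Position_ST`. The report does not say how they were created; a base R sketch that yields the same naming pattern, using `model.matrix`, is shown here on toy data.

```r
# Toy data with one categorical and one numeric column.
players <- data.frame(
  Position = factor(c("RF", "ST", "GK", "ST")),
  Age = c(31, 33, 27, 25)
)

# One 0/1 indicator column per position level (no intercept column).
dummies <- model.matrix(~ Position - 1, data = players)

# Rename "PositionRF" to "Position_RF" and so on.
colnames(dummies) <- sub("^Position", "Position_", colnames(dummies))

# Replace the factor column with its indicator columns.
model_data <- cbind(players["Age"], dummies)
model_data
```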

The variables need to be put on a common scale before modelling, so we will create a function to standardize them.
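
The standardizing function is not shown. The split points in the tree output further down all lie between 0 and 1, which is consistent with min-max scaling, so the sketch below assumes that method. The name `normalize` is hypothetical.

```r
# Min-max scaling: map the smallest value to 0 and the largest to 1.
normalize <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Toy ratings, scaled into a new column-style vector.
overall <- c(46, 70, 94)
newoverall <- normalize(overall)
newoverall
```

In LaTeX notation, each value is transformed as $x' = \frac{x - \min(x)}{\max(x) - \min(x)}$.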

Following is the data after standardization

The data will be partitioned into training and validation sets. Following is a snapshot of the training data.
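
The split ratio is not stated in the report. As a sketch, assuming a random 60/40 split, the partition could be done as below. The names `releasetrain` and `releasevalid` appear in the model output; `releasedata`, the seed and the ratio are assumptions.

```r
set.seed(1)   # hypothetical seed, only for reproducibility

# Toy stand-in for the standardized modelling data.
releasedata <- data.frame(
  newvalue = runif(100),
  newreleaseclause = runif(100)
)

# Randomly choose 60% of the row indices for training (assumed ratio).
n <- nrow(releasedata)
train_idx <- sample(n, size = round(0.6 * n))

releasetrain <- releasedata[train_idx, ]
releasevalid <- releasedata[-train_idx, ]

head(releasetrain)
```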

We will first try using a regression tree to predict the release clause values.

As we can see, the value of the player and their overall rating are the most important factors in determining the release clause value according to the regression tree.

## Call:
## rpart(formula = newreleaseclause ~ ., data = releasetrain, method = "anova", 
##     cp = 0.001, minbucket = 1)
##   n= 9985 
## 
##             CP nsplit  rel error     xerror        xstd
## 1  0.580000477      0 1.00000000 1.00015510 0.091077672
## 2  0.154700524      1 0.41999952 0.42820509 0.044590694
## 3  0.141327354      2 0.26529900 0.27973783 0.041209285
## 4  0.030605267      3 0.12397165 0.13273581 0.013593605
## 5  0.019163415      4 0.09336638 0.10780133 0.012848549
## 6  0.016445699      5 0.07420296 0.08022958 0.009315464
## 7  0.016370658      6 0.05775726 0.06946843 0.009219572
## 8  0.006439943      7 0.04138661 0.05489065 0.006335056
## 9  0.005765986      8 0.03494666 0.04998395 0.005690073
## 10 0.002979927      9 0.02918068 0.04518356 0.006087523
## 11 0.002865275     10 0.02620075 0.03678988 0.005192684
## 12 0.002761232     11 0.02333547 0.03528398 0.005178519
## 13 0.001979836     12 0.02057424 0.03213557 0.005004362
## 14 0.001855942     13 0.01859441 0.02922901 0.004960323
## 15 0.001499552     14 0.01673846 0.02709064 0.004959359
## 16 0.001000000     15 0.01523891 0.02557207 0.004919094
## 
## Variable importance
##     newvalue   newoverall      newwage newpotential 
##           44           28           17           11 
## 
## Node number 1: 9985 observations,    complexity param=0.5800005
##   mean=0.02027269, MSE=0.002357297 
##   left son=2 (9710 obs) right son=3 (275 obs)
##   Primary splits:
##       newvalue     < 0.1286184   to the left,  improve=0.58000050, (0 missing)
##       newoverall   < 0.71875     to the left,  improve=0.53973080, (0 missing)
##       newwage      < 0.1019504   to the left,  improve=0.43042300, (0 missing)
##       newpotential < 0.7765957   to the left,  improve=0.41667590, (0 missing)
##       newage       < 0.1551724   to the left,  improve=0.01306933, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.71875     to the left,  agree=0.992, adj=0.695, (0 split)
##       newwage      < 0.1090426   to the left,  agree=0.983, adj=0.396, (0 split)
##       newpotential < 0.7978723   to the left,  agree=0.980, adj=0.265, (0 split)
## 
## Node number 2: 9710 observations,    complexity param=0.1547005
##   mean=0.01405, MSE=0.0005032007 
##   left son=4 (8576 obs) right son=5 (1134 obs)
##   Primary splits:
##       newvalue     < 0.03831547  to the left,  improve=0.74523570, (0 missing)
##       newoverall   < 0.5729167   to the left,  improve=0.66740390, (0 missing)
##       newwage      < 0.02216312  to the left,  improve=0.47675450, (0 missing)
##       newpotential < 0.5851064   to the left,  improve=0.37794150, (0 missing)
##       newage       < 0.1551724   to the left,  improve=0.02111057, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.5729167   to the left,  agree=0.968, adj=0.727, (0 split)
##       newwage      < 0.03102837  to the left,  agree=0.925, adj=0.358, (0 split)
##       newpotential < 0.6702128   to the left,  agree=0.899, adj=0.133, (0 split)
## 
## Node number 3: 275 observations,    complexity param=0.1413274
##   mean=0.2399902, MSE=0.01818075 
##   left son=6 (250 obs) right son=7 (25 obs)
##   Primary splits:
##       newvalue     < 0.4113427   to the left,  improve=0.66534070, (0 missing)
##       newoverall   < 0.8229167   to the left,  improve=0.47946730, (0 missing)
##       newwage      < 0.3306738   to the left,  improve=0.47894950, (0 missing)
##       newpotential < 0.8829787   to the left,  improve=0.41702950, (0 missing)
##       Position_RF  < 0.5         to the left,  improve=0.07250062, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.8645833   to the left,  agree=0.960, adj=0.56, (0 split)
##       newwage      < 0.339539    to the left,  agree=0.953, adj=0.48, (0 split)
##       newpotential < 0.9042553   to the left,  agree=0.938, adj=0.32, (0 split)
## 
## Node number 4: 8576 observations,    complexity param=0.01637066
##   mean=0.007008231, MSE=6.253813e-05 
##   left son=8 (7214 obs) right son=9 (1362 obs)
##   Primary splits:
##       newvalue     < 0.01384083  to the left,  improve=0.71845390, (0 missing)
##       newoverall   < 0.4895833   to the left,  improve=0.60839910, (0 missing)
##       newwage      < 0.007978723 to the left,  improve=0.38225200, (0 missing)
##       newpotential < 0.5         to the left,  improve=0.31291270, (0 missing)
##       newage       < 0.2241379   to the left,  improve=0.02272868, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.4895833   to the left,  agree=0.951, adj=0.689, (0 split)
##       newwage      < 0.01329787  to the left,  agree=0.887, adj=0.286, (0 split)
##       newpotential < 0.712766    to the left,  agree=0.843, adj=0.011, (0 split)
##       Position_RAM < 0.5         to the left,  agree=0.842, adj=0.002, (0 split)
## 
## Node number 5: 1134 observations,    complexity param=0.01916341
##   mean=0.06730413, MSE=0.0006247543 
##   left son=10 (744 obs) right son=11 (390 obs)
##   Primary splits:
##       newvalue     < 0.0737615   to the left,  improve=0.63666800, (0 missing)
##       newpotential < 0.6489362   to the left,  improve=0.34945480, (0 missing)
##       newoverall   < 0.6354167   to the left,  improve=0.31826750, (0 missing)
##       newwage      < 0.03634752  to the left,  improve=0.07173778, (0 missing)
##       newage       < 0.2586207   to the right, improve=0.04520521, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.6354167   to the left,  agree=0.814, adj=0.459, (0 split)
##       newpotential < 0.6702128   to the left,  agree=0.763, adj=0.310, (0 split)
##       newwage      < 0.08421986  to the left,  agree=0.698, adj=0.123, (0 split)
##       Position_CM  < 0.5         to the left,  agree=0.660, adj=0.013, (0 split)
##       Position_RS  < 0.5         to the left,  agree=0.659, adj=0.008, (0 split)
## 
## Node number 6: 250 observations,    complexity param=0.03060527
##   mean=0.2052103, MSE=0.004364888 
##   left son=12 (154 obs) right son=13 (96 obs)
##   Primary splits:
##       newvalue     < 0.2130138   to the left,  improve=0.66015430, (0 missing)
##       newoverall   < 0.7395833   to the left,  improve=0.33241170, (0 missing)
##       newpotential < 0.7765957   to the left,  improve=0.28453130, (0 missing)
##       newwage      < 0.141844    to the left,  improve=0.19918770, (0 missing)
##       Position_RF  < 0.5         to the left,  improve=0.01558079, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.7604167   to the left,  agree=0.776, adj=0.417, (0 split)
##       newpotential < 0.7978723   to the left,  agree=0.764, adj=0.385, (0 split)
##       newwage      < 0.1746454   to the left,  agree=0.712, adj=0.250, (0 split)
##       Position_CM  < 0.5         to the left,  agree=0.632, adj=0.042, (0 split)
##       Position_CF  < 0.5         to the left,  agree=0.624, adj=0.021, (0 split)
## 
## Node number 7: 25 observations,    complexity param=0.0164457
##   mean=0.5877889, MSE=0.02327905 
##   left son=14 (16 obs) right son=15 (9 obs)
##   Primary splits:
##       newvalue     < 0.5695839   to the left,  improve=0.6651346, (0 missing)
##       newoverall   < 0.9479167   to the left,  improve=0.6239562, (0 missing)
##       newpotential < 0.9468085   to the left,  improve=0.5090814, (0 missing)
##       Position_RF  < 0.5         to the left,  improve=0.2938695, (0 missing)
##       newwage      < 0.9024823   to the left,  improve=0.2938695, (0 missing)
##   Surrogate splits:
##       newpotential < 0.9468085   to the left,  agree=0.84, adj=0.556, (0 split)
##       Position_LW  < 0.5         to the left,  agree=0.76, adj=0.333, (0 split)
##       newoverall   < 0.9479167   to the left,  agree=0.72, adj=0.222, (0 split)
##       newwage      < 0.1888298   to the right, agree=0.72, adj=0.222, (0 split)
## 
## Node number 8: 7214 observations,    complexity param=0.001855942
##   mean=0.004095688, MSE=9.623507e-06 
##   left son=16 (5052 obs) right son=17 (2162 obs)
##   Primary splits:
##       newvalue     < 0.005717782 to the left,  improve=0.62924150, (0 missing)
##       newoverall   < 0.40625     to the left,  improve=0.37301160, (0 missing)
##       newpotential < 0.4361702   to the left,  improve=0.31100720, (0 missing)
##       newwage      < 0.004432624 to the left,  improve=0.25283790, (0 missing)
##       Position_GK  < 0.5         to the right, improve=0.02122705, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.4270833   to the left,  agree=0.856, adj=0.518, (0 split)
##       newwage      < 0.004432624 to the left,  agree=0.777, adj=0.256, (0 split)
##       newpotential < 0.5425532   to the left,  agree=0.736, adj=0.118, (0 split)
##       Position_RAM < 0.5         to the left,  agree=0.701, adj=0.002, (0 split)
##       Position_RS  < 0.5         to the left,  agree=0.701, adj=0.001, (0 split)
## 
## Node number 9: 1362 observations,    complexity param=0.001979836
##   mean=0.02243487, MSE=5.989487e-05 
##   left son=18 (892 obs) right son=19 (470 obs)
##   Primary splits:
##       newvalue     < 0.02650013  to the left,  improve=0.57124840, (0 missing)
##       newpotential < 0.5212766   to the left,  improve=0.26211000, (0 missing)
##       newoverall   < 0.53125     to the left,  improve=0.17630320, (0 missing)
##       newwage      < 0.01861702  to the left,  improve=0.05747971, (0 missing)
##       newage       < 0.3275862   to the right, improve=0.04892679, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.53125     to the left,  agree=0.725, adj=0.204, (0 split)
##       newpotential < 0.5851064   to the left,  agree=0.675, adj=0.057, (0 split)
##       newwage      < 0.04166667  to the left,  agree=0.672, adj=0.049, (0 split)
##       Position_RS  < 0.5         to the left,  agree=0.657, adj=0.006, (0 split)
##       newage       < 0.6724138   to the left,  agree=0.657, adj=0.006, (0 split)
## 
## Node number 10: 744 observations,    complexity param=0.002761232
##   mean=0.05286447, MSE=0.0001677509 
##   left son=20 (460 obs) right son=21 (284 obs)
##   Primary splits:
##       newvalue     < 0.05688244  to the left,  improve=0.52074790, (0 missing)
##       newpotential < 0.5851064   to the left,  improve=0.13866300, (0 missing)
##       newoverall   < 0.59375     to the left,  improve=0.08081085, (0 missing)
##       newage       < 0.4310345   to the right, improve=0.04713325, (0 missing)
##       newwage      < 0.02748227  to the left,  improve=0.03198444, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.6145833   to the left,  agree=0.691, adj=0.190, (0 split)
##       newpotential < 0.6276596   to the left,  agree=0.645, adj=0.070, (0 split)
##       newwage      < 0.06826241  to the left,  agree=0.634, adj=0.042, (0 split)
##       Position_CAM < 0.5         to the left,  agree=0.620, adj=0.004, (0 split)
## 
## Node number 11: 390 observations,    complexity param=0.002979927
##   mean=0.09485056, MSE=0.0003400095 
##   left son=22 (236 obs) right son=23 (154 obs)
##   Primary splits:
##       newvalue     < 0.09908009  to the left,  improve=0.52894740, (0 missing)
##       newpotential < 0.6702128   to the left,  improve=0.12634790, (0 missing)
##       newoverall   < 0.65625     to the left,  improve=0.08641443, (0 missing)
##       newage       < 0.1896552   to the right, improve=0.02252821, (0 missing)
##       newwage      < 0.1400709   to the left,  improve=0.02003327, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.6770833   to the left,  agree=0.708, adj=0.260, (0 split)
##       newpotential < 0.7765957   to the left,  agree=0.651, adj=0.117, (0 split)
##       newwage      < 0.1125887   to the left,  agree=0.638, adj=0.084, (0 split)
##       Position_GK  < 0.5         to the left,  agree=0.610, adj=0.013, (0 split)
##       newage       < 0.0862069   to the right, agree=0.610, adj=0.013, (0 split)
## 
## Node number 12: 154 observations,    complexity param=0.002865275
##   mean=0.162828, MSE=0.0008234592 
##   left son=24 (87 obs) right son=25 (67 obs)
##   Primary splits:
##       newvalue     < 0.1665963   to the left,  improve=0.53182160, (0 missing)
##       newwage      < 0.179078    to the left,  improve=0.15706880, (0 missing)
##       newpotential < 0.712766    to the left,  improve=0.15126040, (0 missing)
##       newoverall   < 0.7395833   to the left,  improve=0.11238370, (0 missing)
##       newage       < 0.6034483   to the right, improve=0.02506721, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.7395833   to the left,  agree=0.708, adj=0.328, (0 split)
##       newwage      < 0.179078    to the left,  agree=0.688, adj=0.284, (0 split)
##       newpotential < 0.7978723   to the left,  agree=0.617, adj=0.119, (0 split)
##       Position_CB  < 0.5         to the left,  agree=0.578, adj=0.030, (0 split)
##       Position_RM  < 0.5         to the left,  agree=0.578, adj=0.030, (0 split)
## 
## Node number 13: 96 observations,    complexity param=0.005765986
##   mean=0.2731986, MSE=0.002542024 
##   left son=26 (74 obs) right son=27 (22 obs)
##   Primary splits:
##       newvalue     < 0.3037387   to the left,  improve=0.55614110, (0 missing)
##       newoverall   < 0.78125     to the left,  improve=0.14490330, (0 missing)
##       newwage      < 0.4015957   to the left,  improve=0.09924049, (0 missing)
##       newpotential < 0.7765957   to the left,  improve=0.08751230, (0 missing)
##       Position_LCM < 0.5         to the left,  improve=0.04234820, (0 missing)
##   Surrogate splits:
##       newoverall   < 0.8020833   to the left,  agree=0.823, adj=0.227, (0 split)
##       newwage      < 0.3129433   to the left,  agree=0.802, adj=0.136, (0 split)
##       Position_LB  < 0.5         to the left,  agree=0.781, adj=0.045, (0 split)
##       newpotential < 0.8829787   to the left,  agree=0.781, adj=0.045, (0 split)
## 
## Node number 14: 16 observations
##   mean=0.4944637, MSE=0.002229789 
## 
## Node number 15: 9 observations,    complexity param=0.006439943
##   mean=0.7537002, MSE=0.01768967 
##   left son=30 (7 obs) right son=31 (2 obs)
##   Primary splits:
##       newoverall  < 0.9479167   to the left,  improve=0.9520992, (0 missing)
##       newvalue    < 0.8417588   to the left,  improve=0.9520992, (0 missing)
##       newwage     < 0.9024823   to the left,  improve=0.4045954, (0 missing)
##       Position_RF < 0.5         to the left,  improve=0.4045954, (0 missing)
##       newage      < 0.3275862   to the left,  improve=0.2148805, (0 missing)
##   Surrogate splits:
##       newvalue < 0.8417588   to the left,  agree=1, adj=1, (0 split)
## 
## Node number 16: 5052 observations
##   mean=0.002485889, MSE=2.177476e-06 
## 
## Node number 17: 2162 observations
##   mean=0.007857345, MSE=6.817261e-06 
## 
## Node number 18: 892 observations
##   mean=0.01818893, MSE=2.053715e-05 
## 
## Node number 19: 470 observations
##   mean=0.03049312, MSE=3.544054e-05 
## 
## Node number 20: 460 observations
##   mean=0.04552058, MSE=6.895904e-05 
## 
## Node number 21: 284 observations
##   mean=0.06475951, MSE=9.891795e-05 
## 
## Node number 22: 236 observations
##   mean=0.08401738, MSE=0.0001400493 
## 
## Node number 23: 154 observations
##   mean=0.1114521, MSE=0.000190985 
## 
## Node number 24: 87 observations
##   mean=0.1444634, MSE=0.0003449584 
## 
## Node number 25: 67 observations
##   mean=0.1866746, MSE=0.000438203 
## 
## Node number 26: 74 observations,    complexity param=0.001499552
##   mean=0.2526974, MSE=0.001102119 
##   left son=52 (45 obs) right son=53 (29 obs)
##   Primary splits:
##       newvalue     < 0.2594312   to the left,  improve=0.43277650, (0 missing)
##       newwage      < 0.03280142  to the left,  improve=0.11266980, (0 missing)
##       newoverall   < 0.7395833   to the left,  improve=0.09325620, (0 missing)
##       newage       < 0.362069    to the right, improve=0.08829045, (0 missing)
##       newpotential < 0.7978723   to the left,  improve=0.07997931, (0 missing)
##   Surrogate splits:
##       newoverall  < 0.7604167   to the left,  agree=0.662, adj=0.138, (0 split)
##       newwage     < 0.1462766   to the left,  agree=0.662, adj=0.138, (0 split)
##       Position_CM < 0.5         to the left,  agree=0.635, adj=0.069, (0 split)
##       Position_RW < 0.5         to the left,  agree=0.622, adj=0.034, (0 split)
##       Position_RB < 0.5         to the left,  agree=0.622, adj=0.034, (0 split)
## 
## Node number 27: 22 observations
##   mean=0.3421569, MSE=0.001216364 
## 
## Node number 30: 7 observations
##   mean=0.684331, MSE=0.001085934 
## 
## Node number 31: 2 observations
##   mean=0.9964926, MSE=1.230209e-05 
## 
## Node number 52: 45 observations
##   mean=0.2351651, MSE=0.0006787902 
## 
## Node number 53: 29 observations
##   mean=0.2799027, MSE=0.0005419095
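
The call at the top of the summary above can be tried on toy data as follows. The `rpart` arguments are copied from that call; the data is simulated and only meant to show the shape of the workflow, not to reproduce the numbers in the report.

```r
library(rpart)

set.seed(42)   # hypothetical seed for the simulated data
n <- 500

# Simulated training data: the target depends strongly on newvalue only.
releasetrain <- data.frame(newvalue = runif(n), newage = runif(n))
releasetrain$newreleaseclause <- 1.9 * releasetrain$newvalue + rnorm(n, sd = 0.01)

# Same arguments as the call shown in the summary output.
releasetree <- rpart(newreleaseclause ~ ., data = releasetrain,
                     method = "anova", cp = 0.001, minbucket = 1)

# Importance scores, sorted from most to least important.
releasetree$variable.importance

# Complexity table, like the CP / nsplit / xerror table in the summary.
printcp(releasetree)
```

The `cp` value controls how much each split must improve the fit to be kept, so a small value such as 0.001 allows a fairly deep tree.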

Evaluating the performance of the tree
##                 ME    RMSE      MAE       MPE     MAPE
## Test set -20503.97 2041053 740406.5 -41.69739 62.78034
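
The five columns in the table above are standard error measures. A minimal base R sketch of how they are commonly defined is shown below, with the error taken as actual minus predicted. The function name `accuracy_measures` and the toy numbers are assumptions, and the report's own figures were presumably produced by a package function.

```r
# Common error measures for a set of predictions.
accuracy_measures <- function(actual, predicted) {
  err <- actual - predicted          # raw errors
  pct <- 100 * err / actual          # percentage errors
  c(ME   = mean(err),                # mean error (bias)
    RMSE = sqrt(mean(err^2)),        # root mean squared error
    MAE  = mean(abs(err)),           # mean absolute error
    MPE  = mean(pct),                # mean percentage error
    MAPE = mean(abs(pct)))           # mean absolute percentage error
}

# Toy values, not taken from the report.
actual <- c(100, 200, 400)
predicted <- c(110, 180, 400)
accuracy_measures(actual, predicted)
```

Because MPE and MAPE divide by the actual value, players with very small release clauses can dominate them, which may be one reason a percentage error looks large even when the tree fits the bulk of the data well.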

Linear regression

We will now try a linear regression model
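
The summary below reports coefficients that are "not defined because of singularities". A likely cause, sketched here on simulated data, is that a full set of position dummies always sums to 1 and is therefore collinear with the intercept, so `lm` cannot estimate one of them. The data and seed are assumptions; `releasetrain` and `releaselm` are names taken from the output.

```r
set.seed(7)   # hypothetical seed for the simulated data
n <- 90
pos <- rep(c("GK", "ST", "CB"), each = 30)

# One dummy per position level: together they always sum to 1.
releasetrain <- data.frame(
  Position_GK = as.numeric(pos == "GK"),
  Position_ST = as.numeric(pos == "ST"),
  Position_CB = as.numeric(pos == "CB"),
  newvalue = runif(n)
)
releasetrain$newreleaseclause <- releasetrain$newvalue + rnorm(n, sd = 0.005)

# Same formula as in the report: all remaining columns as predictors.
releaselm <- lm(newreleaseclause ~ ., data = releasetrain)

# The last collinear dummy comes back as NA.
coef(releaselm)
```

Dropping one dummy column per categorical variable before fitting removes the redundancy and the rank-deficiency warning.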

## Warning in predict.lm(releaselm, releasevalid): prediction from a rank-
## deficient fit may be misleading
## 
## Call:
## lm(formula = newreleaseclause ~ ., data = releasetrain)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.061268 -0.000839  0.000038  0.000945  0.071266 
## 
## Coefficients: (2 not defined because of singularities)
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.765e-05  7.874e-04   0.073 0.941638    
## Position_RF  -4.440e-03  1.824e-03  -2.435 0.014926 *  
## Position_ST  -2.025e-04  7.362e-04  -0.275 0.783285    
## Position_LW   4.657e-04  8.049e-04   0.579 0.562863    
## Position_GK   1.155e-04  7.362e-04   0.157 0.875330    
## Position_RCM -8.497e-05  7.957e-04  -0.107 0.914962    
## Position_LF  -2.620e-02  2.172e-03 -12.065  < 2e-16 ***
## Position_RS  -2.995e-04  8.553e-04  -0.350 0.726202    
## Position_RCB  1.172e-04  7.660e-04   0.153 0.878448    
## Position_LCM -8.277e-05  8.002e-04  -0.103 0.917611    
## Position_CB   1.275e-04  7.382e-04   0.173 0.862896    
## Position_LDM -3.681e-04  8.465e-04  -0.435 0.663664    
## Position_CAM -5.727e-04  7.542e-04  -0.759 0.447664    
## Position_CDM  2.007e-04  7.527e-04   0.267 0.789728    
## Position_LS  -1.465e-03  8.634e-04  -1.697 0.089730 .  
## Position_LCB  1.573e-04  7.688e-04   0.205 0.837877    
## Position_RM  -1.340e-04  7.485e-04  -0.179 0.857971    
## Position_LAM  8.798e-04  1.737e-03   0.507 0.612414    
## Position_LM   3.373e-04  7.490e-04   0.450 0.652504    
## Position_LB   2.437e-04  7.429e-04   0.328 0.742854    
## Position_RDM  6.868e-04  8.343e-04   0.823 0.410386    
## Position_RW  -9.974e-04  8.088e-04  -1.233 0.217560    
## Position_CM  -2.231e-04  7.435e-04  -0.300 0.764167    
## Position_RB   2.666e-04  7.448e-04   0.358 0.720375    
## Position_RAM  2.388e-05  1.613e-03   0.015 0.988186    
## Position_CF  -1.223e-03  1.085e-03  -1.127 0.259618    
## Position_RWB  2.301e-04  1.009e-03   0.228 0.819628    
## Position_LWB         NA         NA      NA       NA    
## Position_            NA         NA      NA       NA    
## newage        9.092e-05  6.371e-04   0.143 0.886533    
## newoverall   -9.708e-03  9.624e-04 -10.087  < 2e-16 ***
## newpotential  6.372e-03  9.121e-04   6.986    3e-12 ***
## newvalue      1.011e+00  2.320e-03 435.884  < 2e-16 ***
## newwage       8.973e-03  2.485e-03   3.610 0.000308 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.004993 on 9953 degrees of freedom
## Multiple R-squared:  0.9895, Adjusted R-squared:  0.9894 
## F-statistic: 3.014e+04 on 31 and 9953 DF,  p-value: < 2.2e-16
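
The two `NA` rows in the coefficient table (`Position_LWB` and `Position_`) and the rank-deficiency warning indicate that some indicator columns are linear combinations of the other predictors. One common way this happens, which may not be the full explanation here, is keeping an indicator column for every level of a categorical variable together with an intercept. The sketch below reproduces that situation on toy data. The data frame `toy` and its columns `pos`, `value` and `release` are hypothetical and are not taken from the report's data.

```r
# Toy illustration of how one-hot indicator columns can make lm() rank deficient.
# All names here (toy, pos, value, release) are made up for the example.
set.seed(1)
toy <- data.frame(
  pos   = factor(sample(c("GK", "CB", "ST"), 60, replace = TRUE)),
  value = runif(60)
)
toy$release <- 1.8 * toy$value + rnorm(60, sd = 0.01)

# One indicator column per level, with no reference level dropped.
dummies <- model.matrix(~ pos - 1, data = toy)
full <- data.frame(dummies, value = toy$value, release = toy$release)

# With an intercept, the indicator columns sum to 1 in every row,
# so one of them is aliased and its coefficient is reported as NA.
fit_full <- lm(release ~ ., data = full)
n_aliased <- sum(is.na(coef(fit_full)))

# Leaving one indicator out (a reference level) restores full rank.
fit_ref <- lm(release ~ . - posST, data = full)
any_na_ref <- anyNA(coef(fit_ref))

print(n_aliased)
print(any_na_ref)
```

If the same mechanism applies to the release clause model, dropping one `Position_` indicator as a reference level, and any indicator that is constant in the training set, before fitting would be one way to avoid the undefined coefficients.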
## Start:  AIC=-105803.8
## newreleaseclause ~ Position_RF + Position_ST + Position_LW + 
##     Position_GK + Position_RCM + Position_LF + Position_RS + 
##     Position_RCB + Position_LCM + Position_CB + Position_LDM + 
##     Position_CAM + Position_CDM + Position_LS + Position_LCB + 
##     Position_RM + Position_LAM + Position_LM + Position_LB + 
##     Position_RDM + Position_RW + Position_CM + Position_RB + 
##     Position_RAM + Position_CF + Position_RWB + Position_LWB + 
##     Position_ + newage + newoverall + newpotential + newvalue + 
##     newwage
## 
## 
## Step:  AIC=-105803.8
## newreleaseclause ~ Position_RF + Position_ST + Position_LW + 
##     Position_GK + Position_RCM + Position_LF + Position_RS + 
##     Position_RCB + Position_LCM + Position_CB + Position_LDM + 
##     Position_CAM + Position_CDM + Position_LS + Position_LCB + 
##     Position_RM + Position_LAM + Position_LM + Position_LB + 
##     Position_RDM + Position_RW + Position_CM + Position_RB + 
##     Position_RAM + Position_CF + Position_RWB + Position_LWB + 
##     newage + newoverall + newpotential + newvalue + newwage
## 
## 
## Step:  AIC=-105803.8
## newreleaseclause ~ Position_RF + Position_ST + Position_LW + 
##     Position_GK + Position_RCM + Position_LF + Position_RS + 
##     Position_RCB + Position_LCM + Position_CB + Position_LDM + 
##     Position_CAM + Position_CDM + Position_LS + Position_LCB + 
##     Position_RM + Position_LAM + Position_LM + Position_LB + 
##     Position_RDM + Position_RW + Position_CM + Position_RB + 
##     Position_RAM + Position_CF + Position_RWB + newage + newoverall + 
##     newpotential + newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_RAM  1    0.0000 0.2481 -105806
## - Position_LCM  1    0.0000 0.2481 -105806
## - Position_RCM  1    0.0000 0.2481 -105806
## - newage        1    0.0000 0.2481 -105806
## - Position_RCB  1    0.0000 0.2481 -105806
## - Position_GK   1    0.0000 0.2481 -105806
## - Position_CB   1    0.0000 0.2481 -105806
## - Position_RM   1    0.0000 0.2481 -105806
## - Position_LCB  1    0.0000 0.2481 -105806
## - Position_RWB  1    0.0000 0.2481 -105806
## - Position_CDM  1    0.0000 0.2481 -105806
## - Position_ST   1    0.0000 0.2481 -105806
## - Position_CM   1    0.0000 0.2481 -105806
## - Position_LB   1    0.0000 0.2481 -105806
## - Position_RS   1    0.0000 0.2481 -105806
## - Position_RB   1    0.0000 0.2481 -105806
## - Position_LDM  1    0.0000 0.2481 -105806
## - Position_LM   1    0.0000 0.2481 -105806
## - Position_LAM  1    0.0000 0.2481 -105806
## - Position_LW   1    0.0000 0.2481 -105805
## - Position_CAM  1    0.0000 0.2481 -105805
## - Position_RDM  1    0.0000 0.2481 -105805
## - Position_CF   1    0.0000 0.2482 -105804
## - Position_RW   1    0.0000 0.2482 -105804
## <none>                      0.2481 -105804
## - Position_LS   1    0.0001 0.2482 -105803
## - Position_RF   1    0.0001 0.2483 -105800
## - newwage       1    0.0003 0.2484 -105793
## - newpotential  1    0.0012 0.2493 -105757
## - newoverall    1    0.0025 0.2507 -105704
## - Position_LF   1    0.0036 0.2517 -105661
## - newvalue      1    4.7364 4.9845  -75849
## 
## Step:  AIC=-105805.8
## newreleaseclause ~ Position_RF + Position_ST + Position_LW + 
##     Position_GK + Position_RCM + Position_LF + Position_RS + 
##     Position_RCB + Position_LCM + Position_CB + Position_LDM + 
##     Position_CAM + Position_CDM + Position_LS + Position_LCB + 
##     Position_RM + Position_LAM + Position_LM + Position_LB + 
##     Position_RDM + Position_RW + Position_CM + Position_RB + 
##     Position_CF + Position_RWB + newage + newoverall + newpotential + 
##     newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_LCM  1    0.0000 0.2481 -105808
## - Position_RCM  1    0.0000 0.2481 -105808
## - newage        1    0.0000 0.2481 -105808
## - Position_RCB  1    0.0000 0.2481 -105808
## - Position_GK   1    0.0000 0.2481 -105808
## - Position_CB   1    0.0000 0.2481 -105808
## - Position_RM   1    0.0000 0.2481 -105808
## - Position_LCB  1    0.0000 0.2481 -105808
## - Position_RWB  1    0.0000 0.2481 -105808
## - Position_CDM  1    0.0000 0.2481 -105808
## - Position_ST   1    0.0000 0.2481 -105808
## - Position_CM   1    0.0000 0.2481 -105808
## - Position_LB   1    0.0000 0.2481 -105808
## - Position_RS   1    0.0000 0.2481 -105808
## - Position_RB   1    0.0000 0.2481 -105808
## - Position_LDM  1    0.0000 0.2481 -105808
## - Position_LM   1    0.0000 0.2481 -105808
## - Position_LAM  1    0.0000 0.2481 -105807
## - Position_LW   1    0.0000 0.2481 -105807
## - Position_CAM  1    0.0000 0.2481 -105807
## - Position_RDM  1    0.0000 0.2481 -105807
## - Position_CF   1    0.0000 0.2482 -105806
## - Position_RW   1    0.0000 0.2482 -105806
## <none>                      0.2481 -105806
## - Position_LS   1    0.0001 0.2482 -105804
## - Position_RF   1    0.0002 0.2483 -105802
## - newwage       1    0.0003 0.2484 -105795
## - newpotential  1    0.0012 0.2493 -105759
## - newoverall    1    0.0025 0.2507 -105706
## - Position_LF   1    0.0037 0.2518 -105659
## - newvalue      1    4.7367 4.9848  -75850
## 
## Step:  AIC=-105807.8
## newreleaseclause ~ Position_RF + Position_ST + Position_LW + 
##     Position_GK + Position_RCM + Position_LF + Position_RS + 
##     Position_RCB + Position_CB + Position_LDM + Position_CAM + 
##     Position_CDM + Position_LS + Position_LCB + Position_RM + 
##     Position_LAM + Position_LM + Position_LB + Position_RDM + 
##     Position_RW + Position_CM + Position_RB + Position_CF + Position_RWB + 
##     newage + newoverall + newpotential + newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_RCM  1    0.0000 0.2481 -105810
## - newage        1    0.0000 0.2481 -105810
## - Position_RM   1    0.0000 0.2481 -105810
## - Position_RWB  1    0.0000 0.2481 -105810
## - Position_ST   1    0.0000 0.2481 -105810
## - Position_RS   1    0.0000 0.2481 -105810
## - Position_CM   1    0.0000 0.2481 -105810
## - Position_RCB  1    0.0000 0.2481 -105810
## - Position_GK   1    0.0000 0.2481 -105809
## - Position_LCB  1    0.0000 0.2481 -105809
## - Position_CB   1    0.0000 0.2481 -105809
## - Position_LDM  1    0.0000 0.2481 -105809
## - Position_LAM  1    0.0000 0.2481 -105809
## - Position_CDM  1    0.0000 0.2481 -105809
## - Position_LB   1    0.0000 0.2481 -105809
## - Position_RB   1    0.0000 0.2481 -105809
## - Position_LM   1    0.0000 0.2482 -105809
## - Position_LW   1    0.0000 0.2482 -105808
## - Position_CF   1    0.0000 0.2482 -105808
## - Position_CAM  1    0.0000 0.2482 -105808
## <none>                      0.2481 -105808
## - Position_RDM  1    0.0001 0.2482 -105808
## - Position_RW   1    0.0001 0.2482 -105806
## - Position_LS   1    0.0002 0.2483 -105804
## - Position_RF   1    0.0002 0.2483 -105803
## - newwage       1    0.0003 0.2484 -105797
## - newpotential  1    0.0012 0.2493 -105761
## - newoverall    1    0.0025 0.2507 -105708
## - Position_LF   1    0.0040 0.2521 -105651
## - newvalue      1    4.7379 4.9860  -75850
## 
## Step:  AIC=-105809.7
## newreleaseclause ~ Position_RF + Position_ST + Position_LW + 
##     Position_GK + Position_LF + Position_RS + Position_RCB + 
##     Position_CB + Position_LDM + Position_CAM + Position_CDM + 
##     Position_LS + Position_LCB + Position_RM + Position_LAM + 
##     Position_LM + Position_LB + Position_RDM + Position_RW + 
##     Position_CM + Position_RB + Position_CF + Position_RWB + 
##     newage + newoverall + newpotential + newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - newage        1    0.0000 0.2481 -105812
## - Position_RM   1    0.0000 0.2481 -105812
## - Position_RWB  1    0.0000 0.2481 -105812
## - Position_RS   1    0.0000 0.2481 -105812
## - Position_ST   1    0.0000 0.2481 -105812
## - Position_CM   1    0.0000 0.2481 -105811
## - Position_RCB  1    0.0000 0.2481 -105811
## - Position_LDM  1    0.0000 0.2481 -105811
## - Position_LAM  1    0.0000 0.2481 -105811
## - Position_LCB  1    0.0000 0.2481 -105811
## - Position_GK   1    0.0000 0.2481 -105811
## - Position_CB   1    0.0000 0.2481 -105811
## - Position_CDM  1    0.0000 0.2481 -105811
## - Position_LB   1    0.0000 0.2482 -105811
## - Position_RB   1    0.0000 0.2482 -105810
## - Position_LW   1    0.0000 0.2482 -105810
## - Position_LM   1    0.0000 0.2482 -105810
## - Position_CF   1    0.0000 0.2482 -105810
## <none>                      0.2481 -105810
## - Position_CAM  1    0.0001 0.2482 -105809
## - Position_RDM  1    0.0001 0.2482 -105809
## - Position_RW   1    0.0001 0.2482 -105807
## - Position_RF   1    0.0002 0.2483 -105805
## - Position_LS   1    0.0002 0.2483 -105805
## - newwage       1    0.0003 0.2484 -105799
## - newpotential  1    0.0012 0.2493 -105763
## - newoverall    1    0.0025 0.2507 -105710
## - Position_LF   1    0.0040 0.2521 -105652
## - newvalue      1    4.7379 4.9861  -75852
## 
## Step:  AIC=-105811.7
## newreleaseclause ~ Position_RF + Position_ST + Position_LW + 
##     Position_GK + Position_LF + Position_RS + Position_RCB + 
##     Position_CB + Position_LDM + Position_CAM + Position_CDM + 
##     Position_LS + Position_LCB + Position_RM + Position_LAM + 
##     Position_LM + Position_LB + Position_RDM + Position_RW + 
##     Position_CM + Position_RB + Position_CF + Position_RWB + 
##     newoverall + newpotential + newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_RM   1    0.0000 0.2481 -105814
## - Position_RWB  1    0.0000 0.2481 -105814
## - Position_RS   1    0.0000 0.2481 -105814
## - Position_ST   1    0.0000 0.2481 -105813
## - Position_CM   1    0.0000 0.2481 -105813
## - Position_RCB  1    0.0000 0.2481 -105813
## - Position_LDM  1    0.0000 0.2481 -105813
## - Position_LAM  1    0.0000 0.2481 -105813
## - Position_LCB  1    0.0000 0.2481 -105813
## - Position_GK   1    0.0000 0.2481 -105813
## - Position_CB   1    0.0000 0.2481 -105813
## - Position_CDM  1    0.0000 0.2481 -105813
## - Position_LB   1    0.0000 0.2482 -105813
## - Position_RB   1    0.0000 0.2482 -105812
## - Position_LW   1    0.0000 0.2482 -105812
## - Position_LM   1    0.0000 0.2482 -105812
## - Position_CF   1    0.0000 0.2482 -105812
## <none>                      0.2481 -105812
## - Position_CAM  1    0.0001 0.2482 -105811
## - Position_RDM  1    0.0001 0.2482 -105811
## - Position_RW   1    0.0001 0.2482 -105809
## - Position_RF   1    0.0002 0.2483 -105807
## - Position_LS   1    0.0002 0.2483 -105807
## - newwage       1    0.0003 0.2485 -105800
## - newpotential  1    0.0034 0.2515 -105677
## - Position_LF   1    0.0040 0.2521 -105654
## - newoverall    1    0.0085 0.2566 -105477
## - newvalue      1    4.9277 5.1759  -75481
## 
## Step:  AIC=-105813.7
## newreleaseclause ~ Position_RF + Position_ST + Position_LW + 
##     Position_GK + Position_LF + Position_RS + Position_RCB + 
##     Position_CB + Position_LDM + Position_CAM + Position_CDM + 
##     Position_LS + Position_LCB + Position_LAM + Position_LM + 
##     Position_LB + Position_RDM + Position_RW + Position_CM + 
##     Position_RB + Position_CF + Position_RWB + newoverall + newpotential + 
##     newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_RS   1    0.0000 0.2481 -105816
## - Position_ST   1    0.0000 0.2481 -105815
## - Position_RWB  1    0.0000 0.2481 -105815
## - Position_CM   1    0.0000 0.2481 -105815
## - Position_LDM  1    0.0000 0.2481 -105815
## - Position_LAM  1    0.0000 0.2481 -105815
## - Position_RCB  1    0.0000 0.2481 -105815
## - Position_LCB  1    0.0000 0.2481 -105815
## - Position_GK   1    0.0000 0.2482 -105815
## - Position_CB   1    0.0000 0.2482 -105815
## - Position_CDM  1    0.0000 0.2482 -105814
## - Position_CF   1    0.0000 0.2482 -105814
## <none>                      0.2481 -105814
## - Position_LW   1    0.0001 0.2482 -105813
## - Position_LB   1    0.0001 0.2482 -105813
## - Position_RB   1    0.0001 0.2482 -105813
## - Position_CAM  1    0.0001 0.2482 -105813
## - Position_LM   1    0.0001 0.2482 -105813
## - Position_RDM  1    0.0001 0.2482 -105812
## - Position_RW   1    0.0001 0.2482 -105811
## - Position_RF   1    0.0002 0.2483 -105809
## - Position_LS   1    0.0002 0.2483 -105808
## - newwage       1    0.0003 0.2485 -105802
## - newpotential  1    0.0034 0.2515 -105679
## - Position_LF   1    0.0040 0.2521 -105655
## - newoverall    1    0.0085 0.2566 -105479
## - newvalue      1    4.9278 5.1759  -75483
## 
## Step:  AIC=-105815.5
## newreleaseclause ~ Position_RF + Position_ST + Position_LW + 
##     Position_GK + Position_LF + Position_RCB + Position_CB + 
##     Position_LDM + Position_CAM + Position_CDM + Position_LS + 
##     Position_LCB + Position_LAM + Position_LM + Position_LB + 
##     Position_RDM + Position_RW + Position_CM + Position_RB + 
##     Position_CF + Position_RWB + newoverall + newpotential + 
##     newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_ST   1    0.0000 0.2481 -105817
## - Position_CM   1    0.0000 0.2481 -105817
## - Position_RWB  1    0.0000 0.2481 -105817
## - Position_LDM  1    0.0000 0.2481 -105817
## - Position_LAM  1    0.0000 0.2481 -105817
## - Position_RCB  1    0.0000 0.2481 -105817
## - Position_LCB  1    0.0000 0.2481 -105817
## - Position_GK   1    0.0000 0.2482 -105816
## - Position_CB   1    0.0000 0.2482 -105816
## - Position_CDM  1    0.0000 0.2482 -105816
## - Position_CF   1    0.0000 0.2482 -105816
## <none>                      0.2481 -105816
## - Position_LW   1    0.0001 0.2482 -105815
## - Position_LB   1    0.0001 0.2482 -105815
## - Position_RB   1    0.0001 0.2482 -105815
## - Position_CAM  1    0.0001 0.2482 -105815
## - Position_RDM  1    0.0001 0.2482 -105814
## - Position_LM   1    0.0001 0.2482 -105814
## - Position_RW   1    0.0001 0.2482 -105813
## - Position_RF   1    0.0002 0.2483 -105811
## - Position_LS   1    0.0002 0.2483 -105810
## - newwage       1    0.0003 0.2485 -105804
## - newpotential  1    0.0034 0.2516 -105681
## - Position_LF   1    0.0040 0.2521 -105657
## - newoverall    1    0.0085 0.2567 -105480
## - newvalue      1    4.9279 5.1760  -75485
## 
## Step:  AIC=-105817.4
## newreleaseclause ~ Position_RF + Position_LW + Position_GK + 
##     Position_LF + Position_RCB + Position_CB + Position_LDM + 
##     Position_CAM + Position_CDM + Position_LS + Position_LCB + 
##     Position_LAM + Position_LM + Position_LB + Position_RDM + 
##     Position_RW + Position_CM + Position_RB + Position_CF + Position_RWB + 
##     newoverall + newpotential + newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_CM   1    0.0000 0.2481 -105819
## - Position_LDM  1    0.0000 0.2481 -105819
## - Position_RWB  1    0.0000 0.2481 -105819
## - Position_LAM  1    0.0000 0.2481 -105819
## - Position_RCB  1    0.0000 0.2482 -105818
## - Position_LCB  1    0.0000 0.2482 -105818
## - Position_CF   1    0.0000 0.2482 -105818
## <none>                      0.2481 -105817
## - Position_CDM  1    0.0001 0.2482 -105817
## - Position_CB   1    0.0001 0.2482 -105817
## - Position_GK   1    0.0001 0.2482 -105817
## - Position_CAM  1    0.0001 0.2482 -105817
## - Position_LW   1    0.0001 0.2482 -105817
## - Position_LB   1    0.0001 0.2482 -105816
## - Position_RDM  1    0.0001 0.2482 -105816
## - Position_RB   1    0.0001 0.2482 -105815
## - Position_LM   1    0.0001 0.2482 -105815
## - Position_RW   1    0.0001 0.2483 -105815
## - Position_RF   1    0.0002 0.2483 -105813
## - Position_LS   1    0.0002 0.2483 -105812
## - newwage       1    0.0003 0.2485 -105806
## - newpotential  1    0.0034 0.2516 -105683
## - Position_LF   1    0.0040 0.2522 -105659
## - newoverall    1    0.0086 0.2567 -105480
## - newvalue      1    4.9339 5.1820  -75475
## 
## Step:  AIC=-105819.3
## newreleaseclause ~ Position_RF + Position_LW + Position_GK + 
##     Position_LF + Position_RCB + Position_CB + Position_LDM + 
##     Position_CAM + Position_CDM + Position_LS + Position_LCB + 
##     Position_LAM + Position_LM + Position_LB + Position_RDM + 
##     Position_RW + Position_RB + Position_CF + Position_RWB + 
##     newoverall + newpotential + newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_LDM  1    0.0000 0.2481 -105821
## - Position_RWB  1    0.0000 0.2481 -105821
## - Position_LAM  1    0.0000 0.2481 -105821
## - Position_RCB  1    0.0000 0.2482 -105820
## - Position_LCB  1    0.0000 0.2482 -105820
## - Position_CF   1    0.0000 0.2482 -105820
## <none>                      0.2481 -105819
## - Position_CDM  1    0.0001 0.2482 -105819
## - Position_CAM  1    0.0001 0.2482 -105819
## - Position_CB   1    0.0001 0.2482 -105818
## - Position_GK   1    0.0001 0.2482 -105818
## - Position_LW   1    0.0001 0.2482 -105818
## - Position_RDM  1    0.0001 0.2482 -105817
## - Position_LB   1    0.0001 0.2482 -105817
## - Position_RB   1    0.0001 0.2482 -105817
## - Position_RW   1    0.0001 0.2483 -105817
## - Position_LM   1    0.0001 0.2483 -105816
## - Position_RF   1    0.0002 0.2483 -105815
## - Position_LS   1    0.0002 0.2483 -105814
## - newwage       1    0.0003 0.2485 -105808
## - newpotential  1    0.0034 0.2516 -105685
## - Position_LF   1    0.0040 0.2522 -105661
## - newoverall    1    0.0086 0.2568 -105479
## - newvalue      1    4.9341 5.1822  -75477
## 
## Step:  AIC=-105821.1
## newreleaseclause ~ Position_RF + Position_LW + Position_GK + 
##     Position_LF + Position_RCB + Position_CB + Position_CAM + 
##     Position_CDM + Position_LS + Position_LCB + Position_LAM + 
##     Position_LM + Position_LB + Position_RDM + Position_RW + 
##     Position_RB + Position_CF + Position_RWB + newoverall + newpotential + 
##     newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_RWB  1    0.0000 0.2481 -105823
## - Position_LAM  1    0.0000 0.2481 -105823
## - Position_RCB  1    0.0000 0.2482 -105822
## - Position_LCB  1    0.0000 0.2482 -105822
## - Position_CF   1    0.0000 0.2482 -105821
## <none>                      0.2481 -105821
## - Position_CAM  1    0.0001 0.2482 -105820
## - Position_CDM  1    0.0001 0.2482 -105820
## - Position_CB   1    0.0001 0.2482 -105820
## - Position_LW   1    0.0001 0.2482 -105820
## - Position_GK   1    0.0001 0.2482 -105820
## - Position_RDM  1    0.0001 0.2482 -105819
## - Position_LB   1    0.0001 0.2482 -105819
## - Position_RW   1    0.0001 0.2483 -105818
## - Position_RB   1    0.0001 0.2483 -105818
## - Position_LM   1    0.0001 0.2483 -105818
## - Position_RF   1    0.0002 0.2483 -105817
## - Position_LS   1    0.0002 0.2483 -105816
## - newwage       1    0.0003 0.2485 -105810
## - newpotential  1    0.0034 0.2516 -105686
## - Position_LF   1    0.0040 0.2522 -105663
## - newoverall    1    0.0087 0.2568 -105480
## - newvalue      1    4.9341 5.1822  -75479
## 
## Step:  AIC=-105822.8
## newreleaseclause ~ Position_RF + Position_LW + Position_GK + 
##     Position_LF + Position_RCB + Position_CB + Position_CAM + 
##     Position_CDM + Position_LS + Position_LCB + Position_LAM + 
##     Position_LM + Position_LB + Position_RDM + Position_RW + 
##     Position_RB + Position_CF + newoverall + newpotential + newvalue + 
##     newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_LAM  1    0.0000 0.2482 -105824
## - Position_RCB  1    0.0000 0.2482 -105824
## - Position_LCB  1    0.0000 0.2482 -105823
## - Position_CF   1    0.0000 0.2482 -105823
## <none>                      0.2481 -105823
## - Position_CDM  1    0.0001 0.2482 -105822
## - Position_CAM  1    0.0001 0.2482 -105822
## - Position_CB   1    0.0001 0.2482 -105822
## - Position_GK   1    0.0001 0.2482 -105822
## - Position_LW   1    0.0001 0.2482 -105822
## - Position_RDM  1    0.0001 0.2482 -105821
## - Position_LB   1    0.0001 0.2483 -105820
## - Position_RB   1    0.0001 0.2483 -105820
## - Position_RW   1    0.0001 0.2483 -105820
## - Position_LM   1    0.0001 0.2483 -105819
## - Position_RF   1    0.0002 0.2483 -105818
## - Position_LS   1    0.0002 0.2483 -105818
## - newwage       1    0.0003 0.2485 -105811
## - newpotential  1    0.0034 0.2516 -105688
## - Position_LF   1    0.0040 0.2522 -105664
## - newoverall    1    0.0087 0.2568 -105481
## - newvalue      1    4.9344 5.1825  -75480
## 
## Step:  AIC=-105824.3
## newreleaseclause ~ Position_RF + Position_LW + Position_GK + 
##     Position_LF + Position_RCB + Position_CB + Position_CAM + 
##     Position_CDM + Position_LS + Position_LCB + Position_LM + 
##     Position_LB + Position_RDM + Position_RW + Position_RB + 
##     Position_CF + newoverall + newpotential + newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_RCB  1    0.0000 0.2482 -105825
## - Position_LCB  1    0.0000 0.2482 -105825
## - Position_CF   1    0.0000 0.2482 -105825
## <none>                      0.2482 -105824
## - Position_CDM  1    0.0001 0.2482 -105824
## - Position_CAM  1    0.0001 0.2482 -105824
## - Position_CB   1    0.0001 0.2482 -105823
## - Position_GK   1    0.0001 0.2482 -105823
## - Position_LW   1    0.0001 0.2482 -105823
## - Position_RDM  1    0.0001 0.2483 -105822
## - Position_LB   1    0.0001 0.2483 -105822
## - Position_RB   1    0.0001 0.2483 -105822
## - Position_RW   1    0.0001 0.2483 -105822
## - Position_LM   1    0.0001 0.2483 -105821
## - Position_RF   1    0.0002 0.2483 -105820
## - Position_LS   1    0.0002 0.2483 -105819
## - newwage       1    0.0003 0.2485 -105813
## - newpotential  1    0.0034 0.2516 -105690
## - Position_LF   1    0.0040 0.2522 -105666
## - newoverall    1    0.0087 0.2568 -105483
## - newvalue      1    4.9360 5.1842  -75479
## 
## Step:  AIC=-105825.2
## newreleaseclause ~ Position_RF + Position_LW + Position_GK + 
##     Position_LF + Position_CB + Position_CAM + Position_CDM + 
##     Position_LS + Position_LCB + Position_LM + Position_LB + 
##     Position_RDM + Position_RW + Position_RB + Position_CF + 
##     newoverall + newpotential + newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_LCB  1    0.0000 0.2482 -105826
## - Position_CF   1    0.0000 0.2482 -105825
## <none>                      0.2482 -105825
## - Position_CDM  1    0.0001 0.2482 -105825
## - Position_CB   1    0.0001 0.2482 -105825
## - Position_GK   1    0.0001 0.2482 -105825
## - Position_LW   1    0.0001 0.2483 -105824
## - Position_CAM  1    0.0001 0.2483 -105824
## - Position_RDM  1    0.0001 0.2483 -105823
## - Position_LB   1    0.0001 0.2483 -105823
## - Position_RB   1    0.0001 0.2483 -105823
## - Position_LM   1    0.0001 0.2483 -105822
## - Position_RW   1    0.0001 0.2483 -105822
## - Position_RF   1    0.0002 0.2483 -105821
## - Position_LS   1    0.0002 0.2484 -105820
## - newwage       1    0.0003 0.2485 -105814
## - newpotential  1    0.0034 0.2516 -105691
## - Position_LF   1    0.0040 0.2522 -105666
## - newoverall    1    0.0087 0.2568 -105485
## - newvalue      1    4.9413 5.1894  -75471
## 
## Step:  AIC=-105826
## newreleaseclause ~ Position_RF + Position_LW + Position_GK + 
##     Position_LF + Position_CB + Position_CAM + Position_CDM + 
##     Position_LS + Position_LM + Position_LB + Position_RDM + 
##     Position_RW + Position_RB + Position_CF + newoverall + newpotential + 
##     newvalue + newwage
## 
##                Df Sum of Sq    RSS     AIC
## - Position_CF   1    0.0000 0.2483 -105826
## - Position_CDM  1    0.0000 0.2483 -105826
## - Position_CB   1    0.0000 0.2483 -105826
## <none>                      0.2482 -105826
## - Position_GK   1    0.0001 0.2483 -105826
## - Position_LW   1    0.0001 0.2483 -105825
## - Position_LB   1    0.0001 0.2483 -105825
## - Position_RDM  1    0.0001 0.2483 -105825
## - Position_RB   1    0.0001 0.2483 -105824
## - Position_CAM  1    0.0001 0.2483 -105824
## - Position_LM   1    0.0001 0.2483 -105824
## - Position_RW   1    0.0001 0.2484 -105823
## - Position_RF   1    0.0002 0.2484 -105821
## - Position_LS   1    0.0002 0.2484 -105820
## - newwage       1    0.0003 0.2486 -105814
## - newpotential  1    0.0034 0.2516 -105692
## - Position_LF   1    0.0040 0.2522 -105667
## - newoverall    1    0.0086 0.2569 -105486
## - newvalue      1    4.9435 5.1917  -75468
## 
## Step:  AIC=-105826.2
## newreleaseclause ~ Position_RF + Position_LW + Position_GK + 
##     Position_LF + Position_CB + Position_CAM + Position_CDM + 
##     Position_LS + Position_LM + Position_LB + Position_RDM + 
##     Position_RW + Position_RB + newoverall + newpotential + newvalue + 
##     newwage
## 
##                Df Sum of Sq    RSS     AIC
## <none>                      0.2483 -105826
## - Position_CDM  1    0.0001 0.2483 -105826
## - Position_CB   1    0.0001 0.2483 -105826
## - Position_GK   1    0.0001 0.2483 -105826
## - Position_LW   1    0.0001 0.2483 -105826
## - Position_CAM  1    0.0001 0.2483 -105825
## - Position_LB   1    0.0001 0.2483 -105825
## - Position_RDM  1    0.0001 0.2483 -105825
## - Position_RB   1    0.0001 0.2484 -105824
## - Position_LM   1    0.0001 0.2484 -105824
## - Position_RW   1    0.0001 0.2484 -105823
## - Position_RF   1    0.0002 0.2484 -105822
## - Position_LS   1    0.0002 0.2485 -105820
## - newwage       1    0.0003 0.2486 -105815
## - newpotential  1    0.0034 0.2516 -105693
## - Position_LF   1    0.0040 0.2523 -105667
## - newoverall    1    0.0086 0.2569 -105488
## - newvalue      1    4.9437 5.1920  -75470
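
The trace above has the layout printed by backward elimination with `step()`: at each step the term whose removal gives the lowest AIC is dropped, and the search stops once `<none>` is at the top of the table. For a linear model, `step()` takes its AIC from `extractAIC()`, which computes `n * log(RSS / n) + 2 * p`, where `p` is the number of estimated coefficients. As a rough check against the starting model, 9953 residual degrees of freedom plus 32 estimated coefficients give `n = 9985`, and with the rounded `RSS = 0.2481` the formula gives approximately -105804, in line with the reported start value. The sketch below shows the same calculation and a backward search on toy data; the data frame `toy` and its columns are hypothetical.

```r
# Toy illustration of the AIC used by step() and of backward elimination.
# All names here (toy, value, overall, noise1, noise2, release) are made up.
set.seed(42)
n <- 200
toy <- data.frame(
  value   = runif(n),
  overall = runif(n),
  noise1  = rnorm(n),   # unrelated to the response
  noise2  = rnorm(n)    # unrelated to the response
)
toy$release <- 1.0 * toy$value - 0.3 * toy$overall + rnorm(n, sd = 0.05)

full_fit <- lm(release ~ ., data = toy)

# AIC as reported in the step() trace: n * log(RSS / n) + 2 * p.
rss <- sum(residuals(full_fit)^2)
p <- length(coef(full_fit))
aic_manual <- n * log(rss / n) + 2 * p
aic_step <- extractAIC(full_fit)[2]

# Backward elimination; trace = 0 suppresses the step-by-step printout.
reduced_fit <- step(full_fit, direction = "backward", trace = 0)
kept_terms <- attr(terms(reduced_fit), "term.labels")

print(c(manual = aic_manual, extractAIC = aic_step))
print(kept_terms)
```

Because each drop is judged only by AIC, weak predictors can survive the search, so the retained terms are best read alongside the coefficient table rather than as proof of importance.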
## Warning in leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax,
## force.in = force.in, : 2 linear dependencies found
## Reordering variables and trying again:
## Subset selection object
## Call: regsubsets.formula(newreleaseclause ~ ., data = releasetrain, 
##     nbest = 1, nvmax = dim(releasetrain)[2], method = "exhaustive")
## 33 Variables  (and intercept)
##              Forced in Forced out
## Position_RF      FALSE      FALSE
## Position_ST      FALSE      FALSE
## Position_LW      FALSE      FALSE
## Position_GK      FALSE      FALSE
## Position_RCM     FALSE      FALSE
## Position_LF      FALSE      FALSE
## Position_RS      FALSE      FALSE
## Position_RCB     FALSE      FALSE
## Position_LCM     FALSE      FALSE
## Position_CB      FALSE      FALSE
## Position_LDM     FALSE      FALSE
## Position_CAM     FALSE      FALSE
## Position_CDM     FALSE      FALSE
## Position_LS      FALSE      FALSE
## Position_LCB     FALSE      FALSE
## Position_RM      FALSE      FALSE
## Position_LAM     FALSE      FALSE
## Position_LM      FALSE      FALSE
## Position_LB      FALSE      FALSE
## Position_RDM     FALSE      FALSE
## Position_RW      FALSE      FALSE
## Position_CM      FALSE      FALSE
## Position_RB      FALSE      FALSE
## Position_RAM     FALSE      FALSE
## Position_CF      FALSE      FALSE
## Position_RWB     FALSE      FALSE
## newage           FALSE      FALSE
## newoverall       FALSE      FALSE
## newpotential     FALSE      FALSE
## newvalue         FALSE      FALSE
## newwage          FALSE      FALSE
## Position_LWB     FALSE      FALSE
## Position_        FALSE      FALSE
## 1 subsets of each size up to 31
## Selection Algorithm: exhaustive
##           Position_RF Position_ST Position_LW Position_GK Position_RCM
## 1  ( 1 )  " "         " "         " "         " "         " "         
## 2  ( 1 )  " "         " "         " "         " "         " "         
## 3  ( 1 )  " "         " "         " "         " "         " "         
## 4  ( 1 )  " "         " "         " "         " "         " "         
## 5  ( 1 )  " "         " "         " "         " "         " "         
## 6  ( 1 )  " "         " "         " "         " "         " "         
## 7  ( 1 )  " "         " "         " "         " "         " "         
## 8  ( 1 )  " "         " "         " "         " "         " "         
## 9  ( 1 )  "*"         " "         " "         " "         " "         
## 10  ( 1 ) "*"         "*"         " "         " "         " "         
## 11  ( 1 ) "*"         "*"         " "         " "         " "         
## 12  ( 1 ) "*"         "*"         " "         " "         " "         
## 13  ( 1 ) "*"         "*"         " "         " "         " "         
## 14  ( 1 ) "*"         "*"         " "         " "         " "         
## 15  ( 1 ) "*"         "*"         " "         " "         " "         
## 16  ( 1 ) "*"         "*"         " "         " "         " "         
## 17  ( 1 ) "*"         "*"         " "         " "         " "         
## 18  ( 1 ) "*"         "*"         "*"         " "         " "         
## 19  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 20  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 21  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 22  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 23  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 24  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 25  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 26  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 27  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 28  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 29  ( 1 ) "*"         "*"         "*"         " "         "*"         
## 30  ( 1 ) "*"         "*"         "*"         "*"         " "         
## 31  ( 1 ) "*"         "*"         "*"         "*"         "*"         
##           Position_LF Position_RS Position_RCB Position_LCM Position_CB
## 1  ( 1 )  " "         " "         " "          " "          " "        
## 2  ( 1 )  " "         " "         " "          " "          " "        
## 3  ( 1 )  "*"         " "         " "          " "          " "        
## 4  ( 1 )  "*"         " "         " "          " "          " "        
## 5  ( 1 )  "*"         " "         " "          " "          " "        
## 6  ( 1 )  "*"         " "         " "          " "          " "        
## 7  ( 1 )  "*"         " "         " "          " "          " "        
## 8  ( 1 )  "*"         " "         " "          " "          " "        
## 9  ( 1 )  "*"         " "         " "          " "          " "        
## 10  ( 1 ) "*"         " "         " "          " "          " "        
## 11  ( 1 ) "*"         " "         " "          " "          " "        
## 12  ( 1 ) "*"         " "         " "          " "          " "        
## 13  ( 1 ) "*"         " "         " "          " "          " "        
## 14  ( 1 ) "*"         " "         " "          " "          " "        
## 15  ( 1 ) "*"         " "         " "          " "          " "        
## 16  ( 1 ) "*"         "*"         " "          " "          " "        
## 17  ( 1 ) "*"         "*"         " "          " "          " "        
## 18  ( 1 ) "*"         "*"         " "          " "          " "        
## 19  ( 1 ) "*"         "*"         " "          " "          " "        
## 20  ( 1 ) "*"         "*"         " "          "*"          " "        
## 21  ( 1 ) "*"         "*"         " "          "*"          " "        
## 22  ( 1 ) "*"         "*"         " "          "*"          " "        
## 23  ( 1 ) "*"         "*"         " "          "*"          " "        
## 24  ( 1 ) "*"         "*"         " "          "*"          " "        
## 25  ( 1 ) "*"         "*"         " "          "*"          " "        
## 26  ( 1 ) "*"         "*"         " "          "*"          " "        
## 27  ( 1 ) "*"         "*"         " "          "*"          " "        
## 28  ( 1 ) "*"         "*"         " "          "*"          " "        
## 29  ( 1 ) "*"         "*"         " "          "*"          " "        
## 30  ( 1 ) "*"         "*"         "*"          " "          "*"        
## 31  ( 1 ) "*"         "*"         "*"          "*"          "*"        
##           Position_LDM Position_CAM Position_CDM Position_LS Position_LCB
## 1  ( 1 )  " "          " "          " "          " "         " "         
## 2  ( 1 )  " "          " "          " "          " "         " "         
## 3  ( 1 )  " "          " "          " "          " "         " "         
## 4  ( 1 )  " "          " "          " "          " "         " "         
## 5  ( 1 )  " "          " "          " "          " "         " "         
## 6  ( 1 )  " "          " "          " "          "*"         " "         
## 7  ( 1 )  " "          " "          " "          "*"         " "         
## 8  ( 1 )  " "          "*"          " "          "*"         " "         
## 9  ( 1 )  " "          "*"          " "          "*"         " "         
## 10  ( 1 ) " "          "*"          " "          "*"         " "         
## 11  ( 1 ) " "          "*"          " "          "*"         " "         
## 12  ( 1 ) " "          "*"          " "          "*"         " "         
## 13  ( 1 ) " "          "*"          " "          "*"         " "         
## 14  ( 1 ) " "          "*"          " "          "*"         " "         
## 15  ( 1 ) "*"          "*"          " "          "*"         " "         
## 16  ( 1 ) "*"          "*"          " "          "*"         " "         
## 17  ( 1 ) "*"          "*"          " "          "*"         " "         
## 18  ( 1 ) "*"          "*"          " "          "*"         " "         
## 19  ( 1 ) "*"          "*"          " "          "*"         " "         
## 20  ( 1 ) "*"          "*"          " "          "*"         " "         
## 21  ( 1 ) "*"          "*"          " "          "*"         " "         
## 22  ( 1 ) "*"          "*"          " "          "*"         " "         
## 23  ( 1 ) "*"          "*"          " "          "*"         " "         
## 24  ( 1 ) "*"          "*"          "*"          "*"         " "         
## 25  ( 1 ) "*"          "*"          "*"          "*"         " "         
## 26  ( 1 ) "*"          "*"          "*"          "*"         " "         
## 27  ( 1 ) "*"          "*"          "*"          "*"         " "         
## 28  ( 1 ) "*"          "*"          "*"          "*"         "*"         
## 29  ( 1 ) "*"          "*"          "*"          "*"         "*"         
## 30  ( 1 ) "*"          "*"          "*"          "*"         "*"         
## 31  ( 1 ) "*"          "*"          "*"          "*"         "*"         
##           Position_RM Position_LAM Position_LM Position_LB Position_RDM
## 1  ( 1 )  " "         " "          " "         " "         " "         
## 2  ( 1 )  " "         " "          " "         " "         " "         
## 3  ( 1 )  " "         " "          " "         " "         " "         
## 4  ( 1 )  " "         " "          " "         " "         " "         
## 5  ( 1 )  " "         " "          " "         " "         " "         
## 6  ( 1 )  " "         " "          " "         " "         " "         
## 7  ( 1 )  " "         " "          " "         " "         " "         
## 8  ( 1 )  " "         " "          " "         " "         " "         
## 9  ( 1 )  " "         " "          " "         " "         " "         
## 10  ( 1 ) " "         " "          " "         " "         " "         
## 11  ( 1 ) " "         " "          " "         " "         " "         
## 12  ( 1 ) " "         " "          " "         " "         " "         
## 13  ( 1 ) "*"         " "          " "         " "         " "         
## 14  ( 1 ) "*"         " "          " "         " "         "*"         
## 15  ( 1 ) "*"         " "          " "         " "         "*"         
## 16  ( 1 ) "*"         " "          " "         " "         "*"         
## 17  ( 1 ) "*"         " "          "*"         " "         "*"         
## 18  ( 1 ) "*"         " "          "*"         " "         "*"         
## 19  ( 1 ) "*"         " "          "*"         " "         "*"         
## 20  ( 1 ) "*"         " "          "*"         " "         "*"         
## 21  ( 1 ) "*"         " "          "*"         " "         "*"         
## 22  ( 1 ) "*"         " "          "*"         "*"         "*"         
## 23  ( 1 ) "*"         "*"          "*"         "*"         "*"         
## 24  ( 1 ) "*"         "*"          "*"         "*"         "*"         
## 25  ( 1 ) "*"         "*"          "*"         "*"         "*"         
## 26  ( 1 ) "*"         "*"          "*"         "*"         "*"         
## 27  ( 1 ) "*"         "*"          "*"         "*"         "*"         
## 28  ( 1 ) "*"         "*"          "*"         "*"         "*"         
## 29  ( 1 ) "*"         "*"          "*"         "*"         "*"         
## 30  ( 1 ) "*"         "*"          "*"         "*"         "*"         
## 31  ( 1 ) "*"         "*"          "*"         "*"         "*"         
##           Position_RW Position_CM Position_RB Position_RAM Position_CF
## 1  ( 1 )  " "         " "         " "         " "          " "        
## 2  ( 1 )  " "         " "         " "         " "          " "        
## 3  ( 1 )  " "         " "         " "         " "          " "        
## 4  ( 1 )  " "         " "         " "         " "          " "        
## 5  ( 1 )  " "         " "         " "         " "          " "        
## 6  ( 1 )  " "         " "         " "         " "          " "        
## 7  ( 1 )  "*"         " "         " "         " "          " "        
## 8  ( 1 )  "*"         " "         " "         " "          " "        
## 9  ( 1 )  "*"         " "         " "         " "          " "        
## 10  ( 1 ) "*"         " "         " "         " "          " "        
## 11  ( 1 ) "*"         "*"         " "         " "          " "        
## 12  ( 1 ) "*"         "*"         " "         " "          "*"        
## 13  ( 1 ) "*"         "*"         " "         " "          "*"        
## 14  ( 1 ) "*"         "*"         " "         " "          "*"        
## 15  ( 1 ) "*"         "*"         " "         " "          "*"        
## 16  ( 1 ) "*"         "*"         " "         " "          "*"        
## 17  ( 1 ) "*"         "*"         " "         " "          "*"        
## 18  ( 1 ) "*"         "*"         " "         " "          "*"        
## 19  ( 1 ) "*"         "*"         " "         " "          "*"        
## 20  ( 1 ) "*"         "*"         " "         " "          "*"        
## 21  ( 1 ) "*"         "*"         "*"         " "          "*"        
## 22  ( 1 ) "*"         "*"         "*"         " "          "*"        
## 23  ( 1 ) "*"         "*"         "*"         " "          "*"        
## 24  ( 1 ) "*"         "*"         "*"         " "          "*"        
## 25  ( 1 ) "*"         "*"         "*"         " "          "*"        
## 26  ( 1 ) "*"         "*"         "*"         " "          "*"        
## 27  ( 1 ) "*"         "*"         "*"         " "          "*"        
## 28  ( 1 ) "*"         "*"         "*"         " "          "*"        
## 29  ( 1 ) "*"         "*"         "*"         "*"          "*"        
## 30  ( 1 ) "*"         "*"         "*"         "*"          "*"        
## 31  ( 1 ) "*"         "*"         "*"         "*"          "*"        
##           Position_RWB Position_LWB Position_ newage newoverall
## 1  ( 1 )  " "          " "          " "       " "    " "       
## 2  ( 1 )  " "          " "          " "       "*"    " "       
## 3  ( 1 )  " "          " "          " "       "*"    " "       
## 4  ( 1 )  " "          " "          " "       " "    "*"       
## 5  ( 1 )  " "          " "          " "       " "    "*"       
## 6  ( 1 )  " "          " "          " "       " "    "*"       
## 7  ( 1 )  " "          " "          " "       " "    "*"       
## 8  ( 1 )  " "          " "          " "       " "    "*"       
## 9  ( 1 )  " "          " "          " "       " "    "*"       
## 10  ( 1 ) " "          " "          " "       " "    "*"       
## 11  ( 1 ) " "          " "          " "       " "    "*"       
## 12  ( 1 ) " "          " "          " "       " "    "*"       
## 13  ( 1 ) " "          " "          " "       " "    "*"       
## 14  ( 1 ) " "          " "          " "       " "    "*"       
## 15  ( 1 ) " "          " "          " "       " "    "*"       
## 16  ( 1 ) " "          " "          " "       " "    "*"       
## 17  ( 1 ) " "          " "          " "       " "    "*"       
## 18  ( 1 ) " "          " "          " "       " "    "*"       
## 19  ( 1 ) " "          " "          " "       " "    "*"       
## 20  ( 1 ) " "          " "          " "       " "    "*"       
## 21  ( 1 ) " "          " "          " "       " "    "*"       
## 22  ( 1 ) " "          " "          " "       " "    "*"       
## 23  ( 1 ) " "          " "          " "       " "    "*"       
## 24  ( 1 ) " "          " "          " "       " "    "*"       
## 25  ( 1 ) " "          "*"          " "       " "    "*"       
## 26  ( 1 ) "*"          "*"          " "       " "    "*"       
## 27  ( 1 ) "*"          "*"          " "       "*"    "*"       
## 28  ( 1 ) "*"          "*"          " "       "*"    "*"       
## 29  ( 1 ) "*"          "*"          " "       "*"    "*"       
## 30  ( 1 ) "*"          "*"          " "       "*"    "*"       
## 31  ( 1 ) "*"          " "          " "       "*"    "*"       
##           newpotential newvalue newwage
## 1  ( 1 )  " "          "*"      " "    
## 2  ( 1 )  " "          "*"      " "    
## 3  ( 1 )  " "          "*"      " "    
## 4  ( 1 )  "*"          "*"      " "    
## 5  ( 1 )  "*"          "*"      "*"    
## 6  ( 1 )  "*"          "*"      "*"    
## 7  ( 1 )  "*"          "*"      "*"    
## 8  ( 1 )  "*"          "*"      "*"    
## 9  ( 1 )  "*"          "*"      "*"    
## 10  ( 1 ) "*"          "*"      "*"    
## 11  ( 1 ) "*"          "*"      "*"    
## 12  ( 1 ) "*"          "*"      "*"    
## 13  ( 1 ) "*"          "*"      "*"    
## 14  ( 1 ) "*"          "*"      "*"    
## 15  ( 1 ) "*"          "*"      "*"    
## 16  ( 1 ) "*"          "*"      "*"    
## 17  ( 1 ) "*"          "*"      "*"    
## 18  ( 1 ) "*"          "*"      "*"    
## 19  ( 1 ) "*"          "*"      "*"    
## 20  ( 1 ) "*"          "*"      "*"    
## 21  ( 1 ) "*"          "*"      "*"    
## 22  ( 1 ) "*"          "*"      "*"    
## 23  ( 1 ) "*"          "*"      "*"    
## 24  ( 1 ) "*"          "*"      "*"    
## 25  ( 1 ) "*"          "*"      "*"    
## 26  ( 1 ) "*"          "*"      "*"    
## 27  ( 1 ) "*"          "*"      "*"    
## 28  ( 1 ) "*"          "*"      "*"    
## 29  ( 1 ) "*"          "*"      "*"    
## 30  ( 1 ) "*"          "*"      "*"    
## 31  ( 1 ) "*"          "*"      "*"
## [1] 16

Evaluating the performance of a linear regression model

Results for the first linear regression model

##                ME    RMSE      MAE       MPE     MAPE
## Test set 18328.18 1207435 503140.8 -1.997648 33.82996
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
##  -5681381    539779   1118287   4488936   3365189 199131088
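
The five error measures in the table above can also be computed by hand, which makes clear what each one reports. The following is a minimal sketch in base R, assuming the error is defined as actual minus predicted (the convention that `forecast::accuracy` is understood to use). The helper name `error_measures` and the numbers are made up for illustration and are not taken from the player data.

```r
# Hypothetical helper: hand-computed versions of the five error measures.
error_measures <- function(actual, predicted) {
  err <- actual - predicted   # positive error means the model under-predicts
  pct <- 100 * err / actual   # percentage error; undefined when actual is 0
  c(ME   = mean(err),         # mean error (signed bias)
    RMSE = sqrt(mean(err^2)), # root mean squared error
    MAE  = mean(abs(err)),    # mean absolute error
    MPE  = mean(pct),         # mean percentage error (signed)
    MAPE = mean(abs(pct)))    # mean absolute percentage error
}

# Made-up release clauses and predictions, for illustration only.
actual    <- c(1000000, 2000000, 4000000, 8000000)
predicted <- c(1100000, 1800000, 4400000, 7600000)

measures <- error_measures(actual, predicted)
print(round(measures, 2))
```

RMSE penalises large misses more heavily than MAE, so a few very expensive players can pull the RMSE well above the MAE, as in the results reported here.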

Results for the linear regression model with Backward elimination

##                ME    RMSE      MAE       MPE     MAPE
## Test set 18188.63 1207057 502836.7 -1.812386 33.84422
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
##  -5678445    539587   1117028   4489075   3358827 199098061
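
Backward elimination of this kind can be sketched with base R's `step()`. The fitting code for the model above is not shown in this section, so the example below is only an assumption about the general approach: it uses the built-in `mtcars` data as a stand-in for the player data, and the object names are hypothetical.

```r
# Built-in mtcars data stands in for the player data (assumption).
full_model <- lm(mpg ~ ., data = mtcars)

# Backward elimination: start from the full model and drop one term at a
# time for as long as dropping a term lowers the AIC.
backward_model <- step(full_model, direction = "backward", trace = 0)

# Compare the reduced model with the full one.
print(formula(backward_model))
print(c(full = AIC(full_model), backward = AIC(backward_model)))
```

Because terms are only ever removed, the reduced model uses a subset of the original predictors, which is why its test-set errors tend to stay close to those of the full model.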

Results for the linear regression model with exhaustive subset search

##                ME    RMSE      MAE       MPE     MAPE
## Test set 18521.42 1207372 502209.2 -1.916571 33.86571
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
##  -5671427    535801   1116903   4488742   3356186 199151560
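
An exhaustive search fits every possible subset of predictors and keeps the best subset of each size, which is what a selection table of asterisks summarises. The sketch below illustrates the idea in base R on the built-in `mtcars` data with four candidate predictors. The data, the predictor list and the object names are assumptions for illustration, and residual sum of squares is used as the criterion within each size.

```r
# Candidate predictors from built-in mtcars (stand-in data, assumption).
predictors <- c("wt", "hp", "qsec", "disp")

# For each subset size k, fit every combination of k predictors and keep
# the one with the smallest residual sum of squares.
best_by_size <- lapply(seq_along(predictors), function(k) {
  subsets <- combn(predictors, k, simplify = FALSE)
  rss <- sapply(subsets, function(vars) {
    fit <- lm(reformulate(vars, response = "mpg"), data = mtcars)
    sum(resid(fit)^2)
  })
  subsets[[which.min(rss)]]
})

# One row per subset size, "*" marking the predictors in the best subset.
selection <- t(sapply(best_by_size, function(vars) {
  ifelse(predictors %in% vars, "*", " ")
}))
dimnames(selection) <- list(seq_along(predictors), predictors)
print(selection, quote = FALSE)
```

The number of subsets doubles with every additional predictor, so with dozens of dummy variables a dedicated implementation is needed rather than a brute-force loop like this one.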
plot(releasestep)

Of the three models, the linear regression model with backward elimination is the one to select: it has the lowest RMSE (1,207,057), although the three models differ by less than 400 on this measure. The remaining error is likely due to the fact that the variables in this dataset are not exhaustive enough to determine the release clause value accurately. A release clause depends on many more factors, such as market conditions at the time, the selling club’s financial situation and the club’s attitude towards release clauses. As the plots show, some clubs, such as FC Barcelona and Real Madrid, tend to set very high release clauses. This is also visible when the regression model is plotted, as players from these clubs tend to have the most influence on the model.

This further shows that building an effective release clause predictor requires more data than just the players’ attributes.

However, in the grand scheme of things, this model has an RMSE of about $1.2 million, which in a real-life scenario would not be a significant sum for a large club.

This does not justify the model by itself, but it suggests that the model can serve as an effective yardstick for clubs preparing their initial transfer bids.